Knowledgebot: Neuroknowledge based Complimentary Learning Model for Question Answering Systems
Authors
Abstract
It is well known that two complementary learning modules are important for achieving human-level language understanding: one gradually acquires structured representations of knowledge from language, while the other quickly learns episodic memories composed of individual pieces of knowledge. In this paper, we propose a new machine learning model that combines symbolic and neural approaches to construct such a complementary learning model. The proposed model extracts symbolic knowledge from natural language sentences and then embeds that symbolic knowledge into a real-valued continuous vector space, yielding a neural representation. The neural representation captures the meaning of, and the correlations between, the pieces of symbolic knowledge, and also generalizes the patterns among them. At the same time, the model rapidly learns to predict a specific piece of knowledge that is supposed to be arrived at by reasoning. As an application of the proposed method, we address a challenging problem, question answering, which requires understanding the context of the language input and reasoning out an answer to the given question based on that context.

Complementary learning systems (CLS) theory provides a strong theoretical basis for understanding the mechanisms of human learning and memory. According to the theory, semantic knowledge is gradually constructed in the neocortex using information from episodic memory. At the same time, episodic memory is rapidly constructed in the hippocampus using the semantic knowledge structure [?], [?], [?]. Drawing on the CLS theory, we consider a model that combines a symbolic approach (from artificial intelligence) with a neural representation approach (from recent machine learning) to build a complementary system for learning human language. In this paper, we propose a new model for constructing a neuroknowledge-based complementary learning architecture that understands and reasons about knowledge written in human language.
For demonstration, we present a novel and challenging problem: a question answering task that requires reasoning out an answer from the text input and the question.

A. Model description

The proposed model consists of four parts: a symbolic knowledge extraction module, a neuroknowledge representation module, an episodic memory module, and an answer module. In the symbolic knowledge extraction module, a symbolic knowledge triplet, (subject, relation, object), is automatically extracted from the input text. The neuroknowledge representation module then learns a generalized neural representation of each knowledge triplet. The episodic memory module learns to pick out specific knowledge given a trigger, such as a question, and the final output is produced by the answer module. A high-level illustration of the model is shown in Figure ??.

1) Symbolic knowledge extraction module: To extract symbolic knowledge automatically from text, we use open information extraction (OpenIE), which can identify entities (subjects and objects) and relations in natural sentences [?]. For example, given the sentence "McCain fought hard against Obama, but finally lost the election," an OpenIE system may extract two triplets. Using this technique, multiple symbolic knowledge triplets are obtained from natural text input.

2) Neuroknowledge representation module: From the symbolic knowledge triplets, the neuroknowledge representation module learns a generalized neural representation of each triplet. There are several approaches to embedding knowledge triplets into neural representations, but most of them focus on entity embeddings that reflect relations in a setting with a fixed number of relations [?], [?]. In the proposed model, we use a factored high-order Boltzmann machine to learn neural representations of knowledge triplets. The factored high-order Boltzmann machine has been shown to capture correlational structure among its inputs [?], [?].
Using this property, we feed the Word2Vec [?] representation of a triplet as input and use the hidden representation as the neuroknowledge representation. Specifically, to obtain the neuroknowledge representation, we first embed each component of a triplet (subject, relation, object) into a continuous vector space using word2vec as follows:

$$e_s = \mathrm{word2vec}(\text{subject}), \quad e_r = \mathrm{word2vec}(\text{relation}), \quad e_o = \mathrm{word2vec}(\text{object}) \tag{1}$$

2016 International Symposium on Perception, Action, and Cognitive Systems, October 27-28, 2016, Beyond AlphaGo, PACS2016
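The embedding step in Eq. (1) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Triplet` type, the `toy_word2vec` helper, the three-dimensional `TOY_VECTORS` table (a stand-in for a trained word2vec model), the example triplet, and the choice to average word vectors for multi-word components. The source does not state how multi-word components are handled.

```python
from typing import Dict, List, NamedTuple

Vector = List[float]


class Triplet(NamedTuple):
    """A symbolic knowledge triplet (subject, relation, object)."""
    subject: str
    relation: str
    obj: str  # named `obj` to avoid shadowing the built-in `object`


# Hypothetical 3-dimensional vectors standing in for a trained word2vec model.
TOY_VECTORS: Dict[str, Vector] = {
    "cat": [1.0, 0.0, 0.0],
    "sat": [0.0, 1.0, 0.0],
    "on": [0.0, 0.0, 1.0],
    "mat": [1.0, 1.0, 0.0],
}


def toy_word2vec(phrase: str, table: Dict[str, Vector] = TOY_VECTORS) -> Vector:
    """Embed a phrase by averaging the vectors of its words.

    Averaging is an assumption of this sketch, not the paper's stated method.
    Raises KeyError for out-of-vocabulary words.
    """
    words = phrase.lower().split()
    if not words:
        raise ValueError("cannot embed an empty phrase")
    vectors = [table[w] for w in words]
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def embed_triplet(t: Triplet) -> Dict[str, Vector]:
    """Compute e_s, e_r, e_o as in Eq. (1), one vector per triplet component."""
    return {
        "e_s": toy_word2vec(t.subject),
        "e_r": toy_word2vec(t.relation),
        "e_o": toy_word2vec(t.obj),
    }


example = embed_triplet(Triplet("cat", "sat on", "mat"))
```

In the model described above, these three vectors would form the input to the factored high-order Boltzmann machine, whose hidden representation serves as the neuroknowledge representation. That stage is not sketched here.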
Similar resources
Question Classification Using a Combination of Classifiers
Question answering systems are developed to provide exact answers to questions posed in natural language. One of the most important parts of a question answering system is question classification. The purpose of question classification is to predict the kind of answer needed for a natural-language question. Existing work can be categorized as rule-based and learning...
A New Statistical Model for Evaluation Interactive Question Answering Systems Using Regression
The development of computer systems and the extensive use of information technology in people's everyday lives have made quick access to information increasingly important. The growing volume of information makes it difficult to manage or control. Thus, some instruments need to be provided to use this information. The QA system is ...
Presenting a Probabilistic Model for Determining Text Coherence in Interactive Question Answering Systems
As in many fields of computational linguistics, evaluation plays an important role in interactive question answering systems. The coherence between the questions and the answers exchanged between the user and the system is one of the important criteria in evaluating these systems. In this paper, a new approach to determining the degree of coherence of the text generated by IQA systems is presented. Th...
Using Generalized Language Model for Question Matching
Question answering services are among the popular services on the World Wide Web. The main goal of these services is to find the best answer to a user's input question as quickly as possible. To achieve this aim, most of them use new techniques for question matching. There are many question answering services on the Persian web, so it seems that developing a question matching m...
Learning Strategies for Open-Domain Natural Language Question Answering
This work presents a model for learning inference procedures for story comprehension through inductive generalization and reinforcement learning, based on classified examples. The learned inference procedures (or strategies) are represented as sequences of transformation rules. The approach is compared to three prior systems, and experimental results are presented demonstrating the efficacy ...